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Abstract 

In this work we develop a dynamically adaptive sparse grids (SG) method for quasi-optimal 
interpolation of multidimensional analytic functions defined over a product of one dimensional 
bounded domains. The goal of such approach is to construct an interpolant in space that corre¬ 
sponds to the “best M-terms” based on sharp a priori estimate of polynomial coefficients. In the 
past, SG methods have been successful in achieving this, with a traditional construction that relies 
on the solution to a Knapsack problem: only the most profitable hierarchical surpluses are added 
to the SG. However, this approach requires additional sharp estimates related to the size of the 
analytic region and the norm of the interpolation operator, i.e., the Lebesgue constant. Instead, we 
present an iterative SG procedure that adaptively refines an estimate of the region and accounts for 
the effects of the Lebesgue constant. Our approach does not require any a priori knowledge of the 
analyticity or operator norm, is easily generalized to both affine and non-affine analytic functions, 
and can be applied to sparse grids build from one dimensional rules with arbitrary growth of the 
number of nodes. In several numerical examples, we utilize our dynamically adaptive SG to inter¬ 
polate quantities of interest related to the solutions of parametrized elliptic and hyperbolic PDEs, 
and compare the performance of our quasi-optimal interpolant to several alternative SG schemes. 

1 Introduction 

This paper considers constructing an approximations to multidimensional analytic functions defined 
over a product of one dimensional bounded domains. The main challenge facing all methods in 
this context is the curse of dimensionality, i.e., the computational complexity of approximation 
techniques increases exponentially with the number of dimensions. To alleviate the curse, methods 
have been proposed that reduce the dimensionality of the problem mus], reduce the complexity 
the target function mu. or approximate the function in an optimal polynomial subspace null]. 

We take the latter approach and we build upon the recent results in best M-terms approximation 
[5l[9l[29], where the function is projected onto the polynomial space associated with the dominant 
coefficients of either a Taylor or Legendre expansion. In implementation, finding the optimal space 
is intractable and instead sharp a priori estimates of the expansion coefficients are used to select a 
quasi-optimal space. Such approach can achieve sub-exponential convergence rate in the context of 
both projection, e.g., [DmHH] and interpolation, e.g., however, the quasi-optimal methods 

rely heavily on a priori estimates of the size of the region of analyticity of the function and sharp 
estimates are available only in few special cases. 

Given a suitable polynomial space, orthogonal projection results in the best approximation, 
however, the projection approach often times comes at a heavy computational cost [HIIlllH]- In 
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contrast, sampling based techniques require only the values of the function at a set of nodes, e.g., 
Monte Carlo random sampling for computing the statistical moments of a function [niEi], and 
Sparse Grids (SG) method for high order polynomial approximation [I^[23l[24] . which is the focus 
on this work. SG sampling does not result in best approximation in the associated polynomial 
space and the error is magnified by the norm of the SG operator, a.k.a., the Lebesgue constant. 
However, sampling tends to be computationally cheaper than projection as well as more susceptible 
to parallelization which usually offsets the moderate increase of the error. In addition, sampling 
procedures can wrap around simulation software that computes single realization of the function, 
which simplifies the implementation and allows the use of legacy and third party code. 

Sparse grids algorithms construct multidimensional function approximation from a linear com¬ 
bination of tensors of one dimensional interpolation rules. Quasi-optimal SG are traditionally 
constructed as the solution to a Knapsack problem [SlEl], where the selected set of tensors is as¬ 
sociated with the largest profit index that is derived from an a priori estimate of the hierarchical 
surplus, the Lebesgue constant, and the number of samples in a tensor. In the case when the one 
dimensional rules grow by one node at a time, a near optimal greedy procedure using the Taylor 
coefficients of the function can construct a suitable approximation however, without a priori 

assumptions, selecting the optimal set of coefficients comes at a very high computational cost. 

In this work, we present an iterative procedure for constructing a sequence of SG interpolants 
with increasing number of nodes and accuracy, that does not require a priori estimates of the region 
of analyticity. We focus our attention to the nested SG case, where all nodes associated with one 
grid are also utilized by the next grid in the sequence, thus reusing all available samples. We 
review popular one dimensional nested rules such as Clenshaw-Gurtis [7] and Leja laiiQ] and we 
present several new rules based on greedy minimization of operator norms. In addition, for any 
chosen rule and any arbitrary lower (i.e., admissible 0) polynomial space, we present a strategy 
for selecting the minimal set of tensors that yields an interpolant in that space. Every interpolant 
in the sequence is constructed using this strategy, which circumvents the Knapsack problem and 
allows us to restrict our attention to the selection of the optimal polynomial spaces. 

The quasi-optimal polynomial space associated with Legendre coefficients is a total degree space 
with a small logarithmic correction mm- However, while the Legendre space is optimal with 
respect to projection, in the context of interpolation, the quasi-optimal estimate does not account 
for the effect of the Lebesgue constant. Using estimates of the operator norm of the one dimensional 
rules, we add a strong correction to the total degree space to arrive at a an estimate for the quasi- 
optimal interpolation space. Our estimate is parametrized by two vectors associated with the size 
of the analytic region of the function and the growth of the Lebesgue constant of the interpolation 
rules. 

In order to keep our approach free from a priori assumptions, we present a procedure for dy¬ 
namically estimating the two vector parameters. For each interpolant in the sequence, we consider 
the orthogonal decomposition of the interpolant into a linear combination of multivariate Legen¬ 
dre polynomials. Then, we seek the vectors that give the best fit of our quasi-optimal estimate 
to the decay rate of the Legendre coefficients, i.e., using least-squares approach. The polynomial 
space used for the construction of the next interpolant in the sequence is optimal with respect to 
the parameters inferred from the previous interpolant. The number of additional nodes in each 
interpolant can be chosen arbitrarily, however, few nodes result in more frequent update of the 
parameter vectors which leads to better accuracy, while larger number of nodes allows for greater 
parallelization. 

The procedure for estimating the quasi-optimal polynomial space can be coupled with any 
approximation strategy that satisfies a mild assumption regarding the growth of the Lebesgue 
constant. One potential alternative is to use interpolation based on Fekete points, however, even in 
moderate dimensions, finding those points involves an ill-conditioned and prohibitively expensive 
problem. Other popular alternatives are the optimization based methods that construct the an 
approximation based on minimization of (e.g., least-squares |14j ) or (e.g., compressed sensing 
m) norms. Those methods can be applied to sets of random samples, however, the number 
of samples needed to construct the approximation always exceeds the cardinality of the optimal 
polynomial space. We assume that we can choose the abscissas for each samples and we want to 
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exploit the fact that the range of an interpolation operator has exactly the same degrees of freedom 
as the number of interpolation nodes. Thus, the sparse grids interpolants are best suited for our 
context. 

The rest of the paper is organized as follows, in 31]we derive an estimate of the quasi-optimal 
interpolation space and we present an iterative procedure for generating a sequence of quasi-optimal 
polynomial spaces. In ^ we present a strategy for constructing sparse grids operators with minimal 
number of nodes and we present several one dimensional interpolation rules. In ^we present several 
numerical examples. 

2 Quasi-optimal polynomial space 

We consider the problem of approximating a multivariate function f{y) : F —>■ R, where T C is 
a d-dimensional hypercube, i.e., T = Ffc and without loss of generality we let T^ = [—1,1]. 

We assume that f{y) admits holomorphic extension to a poly-ellipse in complex plane, i.e., 

Assumption 1 (Holomorphic extension) For a vector p € with pk > 1, the map z —>■ f{z) 
is holomorphic in an open neighborhood of the poly-ellipse 

= U ^ jzfc e C : |jR(2:fc)| < cos(6>), |S(zfc)| < sin(6>)| (1) 

6»G[0,27r] l<fc<d ^ 

where and 3(zfe) indicate the real and complex part of Z}~. 

Due to this assumption, we aim at approximaiting f{y) with globally defined polynomials over T. 

To achieve this goal, we introduce a multivariate polynomial space 

^A(p)(r) = span{y'' : v € A(p)} , 

in which it will be convenient to use multi-index notatior{3, where A(p) is a sequence of lower 
multi-index sefl A global polynomial approximation of f{y) in ’PA(p)(r) has the form 

/a(p) = 'y ^ 
i^£A(p) 

where span{(l)^{y) : u G A(p)} = PA(p)(r), and the choice of (fuiv) and c^, is method specific. To 
alleviate the curse of dimensionality, the polynomial space 7^A(p)(r) should be chosen with as few 
degrees of freedom necessary to approximate f{y) with sufficient accuracy. Using Assumption [T1 
let Qffc = log(pfe) for 1 < fc < d dictate the anisotropy in each direction, then the most common 
choice of the polynomial space is the tensor product space 7^A'rP(p)(r), where 

= {iz e N"* : max < p}, 

l<k<d 

and the cardinality depends exponentially on the dimension. Several alternative polynomial spaces 
have been proposed, namely total degree space with A^^(p) = hyperbolic cross 

section A^‘^{p) = {/z G : (zz -|- 1)“ < p} and Smolyak A^”^{p) = {/z G : a • log 2 (iz -b 1) < p}, 
where • indicates vector dot product and log(iz -|- 1) = + 1) (see [18] and references 

therein). 

Our approach is motivated by recent work on best M-term quasi-optimal Galerkin approxima¬ 
tion [n[8l[28]. Consider the orthogonal decomposition of f{y) 

f{y) = X] 

^For the remainder of the paper we let N be the set of natural numbers including zero, and A, 0 C N'^ will denote 
set of multi-indexes. For any two vectors, we define v'k' with the usual convention 0° = 1. 

^ A set A is caller lower or admissible if zz G A implies {i G N‘^ : i < zz} c A, where i < zz if and only if ik < for 
all 1 < fc < d. 
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where Li,[y) are the multivariate Legendre polynomials. For any lower set A(p) the projection of 
f{y) onto PA(p)(r) is given by /a(p)(2/) = CuLu{y) and the approximation error is 

\]f-fAip)\\h= E 

i/^A(p) 

Therefore, the best M-term space for projection is the space associated with the M largest coeffi¬ 
cients. The coefficients are not known a priori, however, when Assumption [T] is satisfied, a sharp 
upper bound to \c,^\ is given by 


d 

|ci/| < C ■ exp{—a ■ u) n y/2vu 1, (2) 

fe=l 

for some constant C DUS]. The quasi-optimal projection space is then 
indexes that yield the largest upper estimate, i.e., 

f 1 

A“(p) = log(j^fc + 0.5) < 

I k=l 

where we derive the condition by taking the log of the right hand side of ([2]), changing the sign and 
ignoring the constant. In this context, p € N is an arbitrary variable used to discretize the index 
space into levels. 

2.1 Quasi-optimal interpolation 

We are interested in constructing approximation using a computationally cheap sampling scheme. 
Projection results in optimal T^(r) error, however, interpolatory approximations are not optimal. 
Let A(L) be a lower set and let fA(L){y) be in interpolatory approximation of f{y) in PA(L)(r), 
then 

ll/-/A(L)lk“ < (1+Ca(l)) inf ||/-p||ioo, (3) 

pe7’A(i,)(r) 

where Ca(l) is the Lebesgue constant, i.e., norm of the interpolation operator, that manifests as an 
additional penalty term. Observe that 


associated with the multi- 


(4) 

L“ 

therefore, the dominant polynomial space associated with the infimum term in ([3|) can be (heuris- 
tically) approximated by estimate ([2]), i.e., the difference comes only from using L^{T) as opposed 
to the L°°{r) norm. However, to define a proper quasi-optimal interpolation space, ([2]) has to be 
combined with an estimate of the Lebesgue constant. Here we make the following assumptior0: 

Assumption 2 (Lebesgue constant) Let C^, indicate the Lebesgue constant associated with the 
smallest lower set that contains u, i.e., then we assume that 

d 

Cu<c^l[{’^k + iy\ ( 5 ) 

k^l 


inf II/-P|U“ < 

p6PA(i)(r) 


/- E ' 

i'eA(L) 


for some constants Cj and {'yk)i<k<d- 


®In Sj3]we derive an estimate for the Lebesgue constant associated with our sparse grids construction and we demon¬ 
strate that Assumption [2] is indeed satisfied. 


4 








Multiplying ([2]) by (O, and using that Vk + 0.5 < + 1 we arrive at the estimate 

d d 

Cu - C ■ exp{—OL ■ v) \/2vk + 1 < C ■ Cj ■ V2 ■ exp{—a ■ v) [vu + ( 6 ) 

k=l k=l 

As before, taking the log of ([HI), reversing the sign and ignoring the constants, we define the quasi- 
optimal interpolation space 

A“’^(A) = {iv e : a • /^ + /3 • log(i/ + 1 ) < L} , (7) 

where /3fe = — 7 ^ — The vectors a and f3 give rise to a discretization of the multi-index space, 
however, the notion of levels does not transcend the specific a and /3, since © depends on the 
scaling of the vector components. Given parameter vectors and a sequence of levels ^ 

we can construct the corresponding quasi-optimal approximations /a“.^(l„)( y)i however, finding 
suitable ex and /3 is of consideration. 

Remark 1 (Preserving the property lower) The entries of (3 in eould be negative, hence, 
is not necessarily a lower set. Lower sets have the advantage that the polynomial space 
associated with the multi-indexes is independent from the hierarchical basis used, e.g., Legendre 
basis, monomials, Newton polynomials. We are interested in constructing a sequence of lower sets, 
thus, i/A“’^(L) is not a lower set, we replace & with 

K<-P{L) = y {3 3 < 1 ^} 

2.2 Estimating the parameters 

Sharp a priori bounds of the region of analytic extension of f{y) and the corresponding <x are 
seldom available, likewise estimates of the Lebesgue constant are either overly conservative or the 
constant fluctuates in a wide range and predicting the effective jk is very difficult. Thus, we need 
a procedure to dynamically estimate the effective parameters ex and /3 that give the best fit of ® 
to the behavior of f{y). 

Assume that we have constructed an interpolant fA{L){y) for some lower set A(L). Here A(L) 
could be chosen according to 0 for some ex and (3, or according to total degree, hyperbolic cross 
section or Smolyak formulas [18]. Since fA{L)[y) S 'Pa(l) there are coefficients c^ for u € A(L) 
such that 

/A(L)(y) = X! 

i^GA(L) 

where Li,{y) are the multidimensional Legendre polynomials. By orthogonality, each of the coeffi¬ 
cients is 

= J fA{L){y)Lu{y)dy, ( 8 ) 

where the integral can be computed with a multidimensional quadrature rule and note this can 
be done without additional evaluations of f{y). Since Li,{y) and fA(L)iy) are polynomials, it is 
sufficient to use a quadrature that can integrate exactly all polynomials in P 2 A{l)- 
We assume that \ci,\ decay at a rate guided by ® for some ex and (3, i.e., 

\ci.\(x exp{-cx-u)(v+ 1)~^ log(|c^|) Ri-C - a • 1 /-/3 • log(i/-I-1), (9) 

for some constant C. In (jOj), all c^, are known and we can solve for ex, (3 and C, however, the decay 
rate of the coefficients is not monotone and hence we look for the parameters that give the “best 
fit”. Here best is used in sense, i.e., we infer approximate a and (3 from the solution to the 
convex minimization problem 

^ X! (c' + a-i^ + /3-fog(iz-(-l)-f log(|c^|)) ( 10 ) 

i/eA(L) 
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For sufficiently large A(L), (fTOl) admits a unique solution. However, the accuracy of the estimated 
a and 0 strongly depends on the size of A(F), in fact, estimates obtained using this least squares 
approach are valid only for sets close to A{L). Furthermore, the constant C is not used in (O and 
since Cexp{—a. ■ u){u + 1)“^ does not give an upper bound on the coefficients, C cannot be used 
to estimate the true approximation error; in this context, C plays the role of a dummy variable. 

Remark 2 (Ad hoc stability constraint) In general, \ci,\ do not decay monotonically and if 
A(L) is small, some of the estimated parameters in a could be negative. The entries of a are 
associated with the convergence rate in different direction and negative entries indicate that for this 
specific direction and specific A(L) we are observing divergence. In case of a negative otk, we use 
an ad hoc correction, where we replace the negative entries with the smallest positive one, i.e., we 
assume that in the limit the diverging direction will in fact converge, albeit slowly. 

The least-squares approach allows us to construct a sequence of polynomial spaces indexed by 
sets {A„}5 )Tq. We start with either Ag = A““’^“(Lo) for some initial guess of ag, /3o and Lg, or Ag 
can be chosen according to total degree, hyperbolic cross section or Smolyak formulas. Then from 
A„ and the interpolant /A„(y), we infer the best fit parameters q:„+i and f3n+i and construct the 
next index set as 

A„+i = A„|jA“-+i’^"+i(A„+i), (11) 

where we take the union to ensure that the sequence is nested and A„+i controls the number of 
additional degrees of freedom in the polynomial space. Using small L^+i, i.e., taking the smallest 
Ln+i such that A“'*+i’^”+'^(L„+i) ^ A„, leads to a more frequent update of the parameters a and 
(3 and hence better accuracy, however, larger Ln+i leads to more samples needed for constructing 
/a„+i (y) and hence more opportunity for parallelization. 

Next, we present a specific sparse grids strategy for constructing /A„(y)- 


3 Sparse grids interpolation 

In this section we present a general sparse grids interpolation approach, which consists of evalu¬ 
ating f{y) at a set of nodes yi,--- ,ym S F and constructing an interpolant f\[L){y) S Pa(l)) 
where A(L) is an arbitrary lower index set, i.e., not necessarily constructed according to ([7]). The 
properties of the SG interpolant are determined by the Lebesgue constant and growth of nodes in 
the one dimensional family of interpolants that induce the grid. Minimizing the number of nodes 
is desirable and ideally we want the number of samples to not exceed the cardinality of A(L), 
however, interpolants with more nodes often times result in smaller operator norm and potentially 
more accurate approximation. For a given A(L) and one dimensional rule, we present a strategy 
for constructing the SG with smallest number of nodes that produces an interpolant in ’Pa(l) • We 
also present several novel one dimensional rules. 

3.1 Constructing optimal interpolant 

A nested one dimensional family of interpolation rules is defined by a distinct sequence of nodes 
G [—1,1] and a strictly increasing growth function m : N —>■ N, i.e., m(0) > 0 and 
m{l -I- 1) > m{l). For I = 0,1,2, - ■ ■ we define the l-th level interpolation operator 

m{l) 

^m(i) : C0(r) ^ 1]), by U^‘[g]iy) = ^ g{yM]{y), 

j=i 


whereg(y) G C°([-l, 1]), the Lagrange basis functions are V^j(?/) = and P„p)_i([-1,1]) 

span{y'^ : 0 < < m(?) — 1}. In addition, for / > 0, we define the surplus operators = 

Um.{0 _ and for notational convenience let A° = Several specific examples of yj 

and TO/ are listed in Table [1] in 1 13.31 Note, we are explicitly assuming the interpolants have nested 
nodes and strictly increasing m{l), see Remarks [3] and [H 
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Taking the d-dimensional tensors, we have 

d d d 

m(i) = 0m(4) : = = 

fc=l k=l k=l 

where G N^, y G T and is evaluated at the corresponding fc-th component of y. The tenor 
operators are given by 

d d 

Arri(i) ^ U^(^)[f]{y)= f{yj)^}{y) 

k—1 k—1 

Note that by the telescoping properties of we have that = J2j<i i.e., each full 

tensor operator can be decomposed into a sum of surplus operators. 

Every multidimensional polynomial space can be included in the range of a full tensor interpo¬ 
lation operator, however, full tensors have a rigid structure and often times require an excessive 
number of samples. Sparse grids offer a flexible alternative, where the interpolant is constructed 
from a sparse set of the surplus operators 0(T), i.e., 

^e(L)= E (12) 

iee(L) 

For any lower index set 0(T), (TT^ is an interpolation operator with nodes {yj}j£ 0 ^{L)j where 

QmiL) = y {j G N"* : 1 < j < m(i)} = y {j € : rrii^i + 1 < j < m{i)} (13) 

iee{L) ie0(L) 

and /e(i)[/] G T’e™(L)-i with QmiL) - 1 = {j G N"* : j -f 1 G Qm{L)} [22]. By definition of 
there exists a set of integer weights {ij}jee(L)i satisfying the system of equations J2j<i j^ 0 {L) ~ ^ 

for every i G 0(T), and the interpolant can be written explicitly as 

i0(L){f]iy)= E •^(%) E ti'^jiy) (14) 

j€Om{L) ie0{L),m{i)<j 

Interpolant (fTTl) is constructed from the samples f{yj) for j G 0m(T), and since the sets of the 
second union of (1131) are disjoint, we have a direct relationship between 0(T), m(l) and the number 
of nodes. 

Remark 3 If the one dimensional family of rules is not nested then generally hlS\) is not an inter¬ 
polant. Even in the non-nested case, operator of the form can produce an accurate approxima¬ 
tion to f{y) l 2 iA\ 24 ^ , however, the approximation belongs to a space of polynomials with cardinality 
much less than the required number of function evaluations. Excess sampling unnecessarily increases 
the computational cost and thus we restrict our attention to nested rules. 

Next, we present the main result of this section, where we consider the optimal choice of tensor 
set that will guarantee the range of the interpolation operator includes a given polynomial space. 

Theorem 1 (Minimal polynomial interpolant) Let A(L) be a lower set of polynomial multi¬ 
indexes and define 

0°^*(L) = {i G N'^ : m (i-l)GA(L)}, (15) 

where for notational convenience we let m{—l) = 0. Then, 'Pa(l) C respect to 

the particular m{l), Q°p*{L) is minimal, i.e., if Q{L) is a lower set such that C ’P 0 „(l)- 1 ; 

then 0(A) is a superset o/0°p*(L). 
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Proof: For arbitrary j £ A(L), by monotonicity of rn{l), there is i such that m(ik — 1) < jk < 
m{ik) — 1 for = 1 , 2 , • • • , d. Then, according to (fTKll 

— 1 ) < J * G 0 °^‘(L), and j < m{i) — 1 =» j G — 1 . 

To show that is minimal, suppose 0(T) is a lower set and A(L) C Qrn{L) — 1. For 

arbitrary i £ 0°p*(L), let j = m(i — 1), then 

j = Tn(i — 1) j £ A(L) =4> j £ 0m(L) — 1 => exists i' £ 0{L) s.t. j < m{i') — 1. 

Thus, m{i — 1) < m{i') — 1 and by monotonicity of m{l) we have i < i'. Since i' £ 0{L) and 
0{L) is a lower set, i £ 0(A). □ 

Corollary 1 For K°^'P{L) defined in the optimal tensor set is 

0“’^(L) = {i G : CK • m{i — 1) + f3 ■ log(m(i — 1) + 1) < L} (16) 

Armed with this result, we define the sparse grids inteprolants associated with the sequence 
of polynomial spaces defined in (ITT|) . Let 0o be the optimal tensor set associated with Aq, then 
/Ao(y) = l 0 o[f]iy) and for n G N 

0 „+i = 0„|j0“"+i’^"+i(L„+i), and /A„+i(y) =/e„+i[/](y) 

where cin+i and (3n+i are estimated from (llOll . Note that unless m{l) = 1 + 1 the interpolants con¬ 
structed in this fashion will be associated with polynomial spaces larger than Va„ ■ The traditional 
Knapsack approach uses m{l) explicitly into the definition of the profit index [^122]. however, in 
our context, rules with m{l) > Z -|- 1 are associated with smaller Lebesgue constant and thus mil) 
affects A„ implicitly through the (3 term. 

Remark 4 Isotropic total degree space defined by X)fc=i Jk < L is an example of a particular 
polynomial space of interest, in fact, large amount of the initial work in sparse grids was aimed 
at construeting total degree interpolants. The work \25f give an optimal construction using one 
dimensional rules with occasionally repeating number of nodes (as opposed to strictly increasing). 
However, Corollary Q] with a = 1 and (3 = 0 gives us the same result as the slow-growth method 
and hence the assumption m(l) < m{l + 1 ) is not restrictive. 

3.2 Lebesgue constant 

In the one dimensional context, the norm of can be estimated numerically and, in some 

cases, sharp theoretical estimates are also available. Thus, we assume there exists G R so 

that < Ai, and for specific examples see T1.3I However, even if A; is sharp, there is 

no known sharp analytic estimate of for a general 0(A) and numerical estimates are 

computationally impractical. In the case when A/ exhibits polynomial growth, i.e.. A/ < Cj(l + 1)'*' 
for some Cj,"/ £ R, Lemma 3.1 in [5] shows that 

\\IeiL)\\co<C^{mL))^^^ (17) 

where f(0{L) indicates the umber of elements in 0(A). This result is not sharp and the effective 
operator norm is usually much smaller than what (13 suggests, however, estimate CZl) indicates 
that the two main factors contributing to the operator norm are the growth of A; (i.e., Cg, and 7 ) 
and the cardinality of 0(A). 

The polynomial growth of A; is sufficient to satisfy Assumption [2] The norm of a full tensor 
operator is < 0^=1 ffi® associated polynomial space is indexed by {i>' G : 

V < m{i) — 1}. By monotonicity of m{l), if i is the index of the smallest full tensor containing v, 
then i <v and 

d d d 

c < n < ^7 n ^ n + 1)" 

fc=l k^l 
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( 18 ) 


Lebesgue constant 




Figure 1: Interpolation operator norm for different sequence of points. Left: the two dashed lines 
correspond to 3\/r+T and dyT+T. Right: the reference lines are |(/ + 1) and ^ log(2* + 1). 


Note that (IlSp implies that 7 fc is independent from k, however, a one dimensional family of rules 
can exhibit very slow increase of the Lebesgue constant for the first several nodes, followed by much 
sharper increase. Thus, when /(y) exhibits strong anisotropic behavior (i.e., the components of a. 
differ significantly), there is a large discrepancy in the number of one dimensional nodes associated 
with various directions, and different 7 fc yield a much sharper estimate. 

3.3 One dimensional rules 

In this section, we present multiple one dimensional interpolation rules with different nodes yj, 
growth sequence mi and operator norm A/. Ideally, we want a rule with slowly growing m(l) and 
A/, since small m; results in fewer interpolation nodes and smaller A; leads to better accuracy of 
the approximation, however, those are competing goals. Rules with rapidly increasing m[l) lead to 
0ji with small cardinality and interpolants with small Lebesgue constant. However, rapid growth 
of m{l) also leads to interpolants with more nodes than degrees of freedom of A„. It is hard to 
predict a priori the optimal relation between m(Z) and A/, and in what follows, we present a list of 
several candidate interpolation rules. 

The roots and extrema of Chebyshev polynomials are a popular choice for interpolation nodes 
and the Chenshaw-Curtis [7] rule is one of the most widely used. The Chenshaw-Curtis nodes yj 
are dehned by 

yi = 0, y 2 = 1, y 3 = -l, forj>3, = cos (^ 2 “ri°S 20 -i)l - 3 ) 7 r^ , (19) 

where [a;] = min{i G Z : i > x}. The growth sequence starts with to( 0) = 1 and for Z > 0 we have 
m{l) = 2* + 1. The operator norm increases logarithmically in number of nodes 

Xi = — log(m(Z) - 1) + 1 = - log( 2 ') + 1 . (20) 

TT TT 

Another similar example is the Fejer type-2 interpolation based on the interior roots of Chebyshev 
polynomials m- The nodes are given by 


yj = cos ^2 d° 82 (i+i)l (2j + l) 7 r^ , ( 21 ) 

with growth sequence m{l) = 2*+^ — 1. Similar to Clenshaw-Curtis, the operator norm exhibits 
logarithmic dependence on m{l). 
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Remark 5 For interpolation rules that exhibit logarithmic increase of Xi, replacing + 1)"*' in 
Assumption\^with\og{yk-\-\-) may yield a sharper estimate. However, the effects of the logarithmic 
term are negligible even for relatively small cx. See Figure\^in 0 

A nested rule based on Chebyshev nodes and a slower growing m; is the 7?.-Leja sequence 
presented in [5]. Let {Ok}^^i be the sequence defined recursively by 


6 »i = 0 , 6 » 2 = 7 r, ^3 = ^, forj>3, 


9j-i + TT, j is odd 
j is even 


( 22 ) 


then the 72.-Leja nodes are yj = cos{9j) with m{l) = 1 + 1, and A/ < 2{l + 1)^ log(Z + 1). The single 
point growth gives great flexibility, since the number of nodes needed to construct /a„(2/) always 
equals the number of indexes in A„, however, the operator norm of 72.-Leja based interpolants is 
usually larger than Clenshaw-Curtis and Fejer rules. 

The quadratic estimate of the Lebesgue constant for the 7^-Leja sequence is sharp for the worst 
case, however, the actual penalty fluctuates between quadratic and logarithmic (in the number of 
nodes), which gives rise to a family of rules with different m{l) [ 6 ]. First, we define the centered 
7?.-Leja sequence as 

yi=0, y 2 = 1, y3 = -l, 2 /i = cos( 6 »j), (23) 

where 9j is defined as in (|2^ . With the centered rule, if we select m(/) = 2^ + 1 (and m(0) = 1), 
then the resulting one dimensional interpolants are identical to those defined by the Clenshaw- 
Curtis rule. In general, 7?.-Leja sub-sequences with exponentially growing m{l) exhibit linear (in 1) 
increase of A;. Two specific examples include the 7?.-Leja double-2 rule defined by 


m(0) = 1, m(l) = 3, for I > 1, m(/) 


2L^J+i 



+ Ij 


(24) 


and the 7?.-Leja double-4 rule defined by 


m(0) = 1, m(l) = 3, 


for I > 1 , m{l) = 2^"*"^ 4 J ^1 H- - — 



(25) 


where [xj = max{i £ Z : i < x}. The word double in the name refers to the fact that I appears in 
the exponents of 2 (i.e., we are doubling the number of nodes) and the numbers 2 and 4 refer to the 
delay of the doubling (i.e., m{l -I- 2) — 1 = 2{m{l) — 1) and m{l -I- 4) — 1 = 2{m{l) — 1) respectively). 
Yet another option is to use the odd sub-sequence, namely m{l) = 21 + 1. The 7?.-Leja odd rules 
result in interpolants with symmetric distribution of nodes and slightly lower operator norm. See 
Tabled] for a summary of the 7?.-Leja rules. 

The 7^-Leja sequence is constructed as the solution to a greedy optimization problem defined 
on the unit disk in the complex plane, the resulting complex nodes are projected onto the real line. 
The optimization problem can also be dehned on [—1,1] as j/i = 0 and 


j 

yj+i = argmax TT 12 / - yi I , (26) 

where, if the optimization admits more than one solution, we take the right-most one. This con¬ 
struction leads to Leja interpolation m and the number of points in Leja interpolants can grow 
one at a time, i.e., m(/) = 2 -|- 1, or we can use only the odd rules, i.e., m{l) = 21 + 1. Unlike 
the 7^-Leja sequence, there is no known sharp estimate of the operator norm of Leja interpolants, 
however, numerical tests shows that for 2 < 50 we can take A; ~ 3{l -f 1)^/^. 

Alternative greedy sequences can be constructed by replacing (1261) with maximization of the 
Lebesgue function 33 , 

Vj+i = argmax ^ 
ye[-i.i] 

minimization of 


n 


y yi 


93' - Vi 


(27) 
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Name 

mi 


Vk 

Clenshaw-Curtis 

1 ,2' + 1 

flog(2' + l) 

see (fT^ 

Fejer type 2 

2 i+i _ 1 

1 log(2^+i - 1) 

see (f^ 

7?.-Leja 

1 -\- \ 

1.5(1+ 1) 

cos(0fc) (see [22]) 

7?.-Leja double-2 

see (fMl) 

1.5(1+ 1) 

see ([2^ 

7?.-Leja double-4 

see (f25]i 

1.5(1+ 1) 

see (f23]l 

7?.-Leja odd 

2 / -b 1 

3(1 -b 1) 

see ([25|1 

Leja 

/ + 1 

3^/^+T 

see (f26]l 

Leja odd 

2 / -b 1 

eVT+T 

see (pSjl 

max-Lebesgue 

/ + 1 

4ViTT 

see (f27|l 

max-Lebesgue odd 

2 / -b 1 

SvT+T 

see (f^ 

min-Lebesgue 

/ + 1 

4ViTT 

see (f28]l 

min-Lebesgue odd 

2 / -b 1 

Sv'T+T 

see (f28]l 

min-delta 

/ + 1 

‘iVTTi 

see (l2^ 

min-delta odd 

2 / -b 1 

Qy/m 

see ([2^ 


Table 1: Summary of one dimensional rules. Note that the increase of the Lebesgue penalty A/ is 
mesured empirically form the first 50 nodes of the corresponding sequence. 


Ui+i = argmin max TT ^ 
vef-i.il y'e[-i.i] f , 




ye[-i.i] 

or minimization of 


y-Vi 


E 


y -y 


tTi ' %■' - y 


n 


y - Vi 


Vj' - Vi 


3 3 

Vi+i = argmm max I 1 + > II 


y - y3' 


yi - yr 


n 

t'=i 


\y -y3' 


y - yr 


(28) 


(29) 


In (I27l) - (l2^ the growth can be prescribed either as a one or two points (i.e., m{l) = 1 + 1 or 
m{l) = 21 + 1). Numerical estimates of A; for each sequence are given in Figure [T] 


Remark 6 (Alternative interpolation rules) Each of the optimization problems 
can be redefined over a multidimensional domain and a greedy procedure can be devised that con¬ 
structs an interpolant for an arbitrary A(L). However, The optimization problem is ill-conditioned 
and feasible only for few dimensions and moderate cardinality of A{L), and therefore not practi¬ 
cal. A notable exception is the Magic Points procedure \2(ff . which is an extension of to a 
multidimensional domain of arbitrary geometry, and the method can be used for non-polynomial 
interpolation. However, in the case of a hypercube F and a lower polynomial space, the functional 
associated with the Magic Points greedy problem is a product of one dimensional functionals of type 
and the maximum is a tensor of one dimensional Leje nodes. Thus, one possible realization 
of the Magic Points algorithm is a sparse grid induced from Leja nodes. 


3.4 Summary of our method 

Our dynamically adaptive sparse grids interpolaiton strategy is summarized in the following algo¬ 
rithm: 

Algorithm 1 (Anisotropic Dynamically Adaptive Multidimensional Approximation) 

Given target function f{y) £ C*^(r), where T is a hypercube in 

Select one dimensional nodes and growth rule m{l) (e.g., form Table\^ 

Select initial Kq, e.g., total degree, hyperbolic cross section or Smolyak formulas. 

Let 00 = 0Q^* according to U5\) 
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for n = 0,1,2, ■■■ to oo do 

Construct /a„ (y) = Iq^ [f] [y) according to {H, ^.e., compute the necessary f{yj) 

Compute the coefficients cfp according to 

Solve the minimization problem m for OLn+i and f3n+i 

If necessary, apply the ah hoc correction of Remark\^ 

Let 0„+i = 0„ U 0“”+'^’^"+! (L„+i) according to \16\) 

If necessary, find Ln+i large enough to exploit parallelism 
end for 

Evaluating /a„ (y) is computationally cheap and the coefficients Ci, are computed using only 
/A„(y), however, for a high dimensional problem this process can still take 10-20 minutes on 6 -core 
CPU. A potentially cheaper alternative is to represent the interpolant using Newton hierarchical 
polynomials. We define the hierarchical basis 


hi{y) = 1, and for j > 1, hj{y) = ^ , Hj{y) = (30) 

y* fe=i 

Then, for any lower tensor set 0(T) and Qm{L) as defined in (fT^ . there are surplus coefficients 
{■Sj}jGem(L) satisfying the system of equations = fiUi) for every i e Qm{L) and 

the interpolant can be written as 


^e(L)[/](y)= E (31) 

jGe™(L) 

The surplus coefficients are much cheaper to compute than the Legendre coefficients, and Sj can 
be used as an alternative to Cv When the one dimensional growth functions is to(/) = / -|- 1, then 
we can infer a and /3 from 

\ E (c + Cfj+f3- log(j 4- 1 ) -b log(|sj D) (32) 

i.e., using Sj in place of c^, in m- However, this approach is not suited for the case when m(l) > 
I -b 1. Any interpolant can be written in surplus form, and the sub-ordering of {j/j}”!® (;-i)-i-i does 
not affect the nodes, polynomial space, or Lebesgue constant. However, the surpluses do depend 
on the sub-ordering, and therefore, Sj are influenced by Aj associated with growth m{l) =1 + 1 
rather than m{l) > 1 -b 1 . 

Remark 7 When &{L) is a lower index set the three constructions and m result in 

identical interpolants. In addition, the associated polynomial space is determined solely by 0(L) 
and the growth sequence m[l), i.e., the choice of nodes does not affects the range of the operator 
(only the norm). The three formulas can be generalized to not lower tensor sets &{L), however, 
if 0(L) is not a lower set, then the three approximations are not equal and only \81\) gives an 
interpolant. Furthermore, when 0(L) is not lower, the associated polynomial spaces depend on the 
specific choice of yj. In our experiments, interpolants constructed from not lower tensor sets are 
less accurate and we restrict our attention to lower sets 0(L). 


4 Numerical results 

In this section we present several numerical examples using functions /(y) that depend on the 
solutions to discretized linear and nonlinear parametric PDEs. We compare the convergence rates 
of SG interpolants for different selections of the tensor sets, including total degree space and the 
0„ constructed using Algorithm [T] We also test the performance of the one dimensional rules from 
Table [2 
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4.1 Parametrized elliptic equation 

The first three examples use similar setup involving the parametrized elliptic equation defined by 

j -Vx-{a{x,y)V^u{x,y)) =b{x), xeD , , 

\ u{x,y) =0, X G dD, ' 

where y G T and D = [0,1] ® [0,1]. The parametrized coefficient a{x, y) is such that 

0 < min a{x, y) < maxa(a;, y) < oo, 

x,y x,y 

in which case for every y GT exists a unique u{x,y) G Hq(T>) that satisfies (I55)) . e.g., [M]. For a 
bounded functional Q : Mq(V) —?► R we define the quantity of interest (Qol) 

f{y) = Q{u{x,y)) : r R. 

The first three numerical examples differ only in the choice of a{x,y), b{x) and Q{u{x,y)). The 
above setup has been the thoroughly studies in literature, e.g., [Tl[5l|8l|9l[23l[24] , and makes for a 
good test bed of novel techniques. 


4.1.1 Karhunen-Loeve expansion 

Let (r2,J^, P) denote a complete probability space, with sample space 12, cr-algebra T = 2^ and 
probability measure P : — >■ [0,1]. For x G [0,1] and y G il. Let ai(a:, y) define a random field 

with covariance function 


Cov 


log(ai — 0.5) {x, x') = exp ( — 


{x — x')" 


LFsing Karhunen-Loeve expansion, we parametrize the random field using the dominant seven eigen¬ 
values and eigenfunctions 


ai{x,y) ~ 0.5-1- exp 1 -|- 




yi 












where y is uniformly distributed over F [^[53]. Assuming that the diffusion coefficient in (1551) 
depends only on the hrst component of cc, we define ai{x,y) = ai(xi,y). The source term is 
bi{x) = cos(a;i) sin(a; 2 ) and the quantity of interest is the L'^{D) norm of the solution u{x,y) 



v?{x, y)dx 


1/2 


y e F C 


(34) 


Analytic expression for /i(y) is not available, however, for a given yj G F, we can approximate 
(1331) with a finite element method (e.g., m) and compute a sufficiently accurate approximation to 
fi{yj)- The error associated with this additional approximation is beyond the scope of this paper 
and we focus our attention to the discretized fi{y)- 

Figure [2] shows a comparison of eight different types of interpolation methods applied to /i(y). 
The left plot shows the results using the Clenshaw-Curtis rule, while the right plot uses the Leja 
nodes. Each plot shows the results from four different selections of the tensor set and the error is 
estimated from 1000 uniformly distributed random samples 


error = , max |/e„[/i](yi) - ■ 

l<i<1000 


(35) 


The isotropic case corresponds to the construction of isotropic total degree polynomial space, 
i.e., as defined in Remark|31 However, the e~^ term in (??) decays fast with increasing k and thus 
yk variables corresponding to larger k will have lesser effect on the overall variation of /i(y). The 
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Figure 2; Applying four different construction of interpolants for fi{y). Left: using Clenshaw-Curtis 
nodes. Right: using Leja nodes. 


isotropic approach fails to capture the different behavior of yk and hence it is the worst performing 
scheme for this example. Anisotropic SG method has been proposed in [23) and using ai{x,y) 
anisotropic weights are derived 

a = (0.85,0.80,0.80,1.0,1.0,1.6,1.6), 
and the analytic anisotropic polynomial space is defined by 

A“(L) = {zy € : a • jy < L}. 

Using A“(L), we construct the corresponding interpolant defined in Theorem (T) the results are 
plotted as the dashed lines in Figure [2] 

For comparison purposes. Figure [2] also plots a Dynamic Total Degree example that uses a 
modified version of Algorithm [TJ In the total degree case, we remove the /3 term from the weight 
function m and the least-squares estimate m, and we compute only a. For both Clenshaw- 
Curtis and Leja nodes, the total degree approach captures the anisotropic behavior of fi{y) and 
the method exhibits faster convergence. The estimated normalized parameters o: are given on Table 
[21 Since the dominant polynomial space is independent from the scaling of a, we divide all entries 
of a by the average ylAji. Both Clenshaw-Curtis and Leja nodes result in nearly identical a even 
though Clenshaw-Curtis parameters are based on the projection of the interpolant ([5|) while the 
Leja parameters are computed using the hierarchical surpluses (1321) . The Clenshaw-Curtis nodes 
have a lower Lebesgue constant, however, the Clenshaw-Curtis nodes result in grids with more 
nodes due to the exponential growth of m{l). For this particular problem using dynamic total 
degree nodes, the lower operator norm of Clenshaw-Curtis nodes leads to faster convergence. 

Figure 121 also shows the results from applying Algorithm [1] to fi{y), and the curve is labeled 
Dynamic Curved. When using Clenshaw-Curtis nodes, there is no significant difference between the 
total degree method and the quasi-optimal interpolation ©■ The estimated (normalized) decay 
parameters a and (3 are listed in Table |3| and we see that (3 is relatively small. In contrast, 
when using the Leja nodes, the quasi-optimal estimate 0 leads to a significant improvement in 
convergence and from Tabled we see that the 0 estimated parameters are smaller compared to the 
Clenshaw-Curtis ones, which is due to the larger Lebesgue constant. 

Figure [3] shows the results of applying Algorithm [T] to fi{y) and using different rules from 
Table [TJ The curves are not smooth since the results are affected by error in the least-squares 
approximation, fluctuations of the operator norm (see Figure |T|), the random samples used to 
compute the error (I55|) . and, when m{l) >1-1-1 the range of the resulting interpolant is a super-set 
of the estimated optimal polynomial space. However, all interpolants have similar convergence rate 
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Dimension 

Oil 

012 

03 

04 

ds 

de 

dij 

Clenshaw-Curtis 

0.58 

0.70 

0.57 

0.99 

0.93 

1.57 

1.66 

Leja 

0.61 

0.70 

0.58 

0.96 

0.91 

1.55 

1.68 

Analytic 

0.78 

0.73 

0.73 

0.92 

0.92 

1.46 

1.46 


Table 2: Estimated a parameters for fi{y) using total degree estimate. 


Dimension 

di 

d2 

ds 

d4 

ds 

dg 

d7 

Clenshaw-Curtis 

0.64 

0.77 

0.80 

0.97 

0.99 

1.48 

1.34 

Leja 

0.67 

0.80 

0.85 

1.08 

1.05 

1.43 

1.12 

Dimension 

/Si 

/32 

/Ss 



/Se 

$7 

Clenshaw-Curtis 

-0.46 

-0.51 

-0.94 

-0.31 

-0.47 

-0.29 

0.09 

Leja 

-1.00 

-1.14 

-1.68 

-1.24 

-1.27 

-0.96 

-0.21 


Table 3: Estimated ci; and /3 parameters for fi{y). 


o 

.5 



Leja 

Leja odd 
Max-Lebesgue 
Max-Lebesgue odd 
Min-Lebesgue 
Min-Lebesgue odd 
Min-Delta 
Min-Delta odd 
R-Leja 
R-Leja odd 
R-Leja Double-2 
R-Leja Double-4 
Clenshaw-Curtis 


6000 8000 10000 12000 14000 16000 18000 

Number of samples 


Figure 3: Algorithm [T] applied to fi{y) using all rules from Tabled) 


Dimension 

(5di 

6a2 

(5d3 

5d4 

Jds 

5/3i 

6(32 

6/33 

6/34 

6/35 

Clenshaw-Curtis 

0.80 

0.45 

0.82 

0.50 

0.67 

1.27 

0.88 

0.93 

1.70 

0.89 


Table 4; Relative variation of a and (3 parameters for fi{y) over all rules from Table [T]- 
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Figure 4: Mesh for the inclusion problem. Each disk has radius ^ and the centers are at 0.2, 0.5 and 
0.8 along the horizontal and vertical axis. The square box is located at the center and has side of 0.2. 

within the “noise” in the method. Table |4] gives the relative variation of a and (3 components 
for all rules in Table [U where 6 ak is computed as the difference between the largest and smallest 
estimate for ak divided by the average (same for /3fc). From the first 5 components, we see that 
the components of (3 vary over a much wider range than those of a, since the a is affected by 
only by the “noise” while (3 is also affected by the different Lebesgue constants for the different 
rules. Note that the table lists only the first 5 components, since and yr are associated with the 
smallest eigenvalue of the Karhunen-Loeve expansion, fi{y) is least sensitive to ye and j/ 7 , and the 
constructed interpolants have few nodes in those two directions, which leads to unreliable estimates. 


4.1.2 Inclusion problem 

In this section we apply our approach to two inclusion problems, presented in [T], where we consider 
the domain given on Figure ID 


Isotropic case 

The (almost) isotropic inclusion problem associates each of the eight disks with a component of 
y S ®fc=i[0-01, 1 ] and we define 


a2{x,y) 


yk, xGBiskk , , 
1, otherwise. 


100, X G Box 
0, otherwise. 



u{x, y)dx 


Figure [5] shows a plot of the convergence rate of interpolants based on Clenshaw-Curtis and Leja 
nodes. Figure [5] compares three possible selections of the polynomial space, isotropic total degree 
(see Remark [U, dynamic curved (i.e.. Algorithm [T]) , and dynamic total degree (i.e., Algorithm [T] 
ignoring f3). The disks are not equidistant from the box and thus the components of y do not 
have equal influence on / 2 (y), however, the anisotropy is weak and the performance of the the 
anisotropic total selection is indistinguishable from the isotropic interpolant. On the other hand, 
our quasi-optimal approach has a significantly better convergence. Table [5] gives the final values 
of the estimated a and [3 parameters, similar to the Karhunen-Loeve example, the a parameters 
are almost identical for both Clenshaw-Curtis and Leja nodes, however, the (3 parameters for the 
Leja nodes are smaller due to the larger operator norm. Despite the seemingly small values of (3, 
the added flexibility of the log term in ([7]) results in accurate interpolants with thousands of fewer 
nodes as compared to the total degree construction. 


16 




Error 




Number of samples Number of samples 


Figure 5; Applying three different construction of interpolants for f 2 {y)- Left: using Clenshaw-Curtis 
nodes. Right: using Leja nodes. 


Dimension 

ai 

0:2 

as 

0:4 

05 

ae 

C(7 

as 

Clenshaw-Curtis 

0.86 

1.32 

0.67 

1.32 

1.25 

0.67 

1.26 

0.66 

Leja 

0.94 

1.32 

0.66 

1.32 

1.22 

0.66 

1.22 

0.66 

Dimension 

k 

h 

k 

/34 

k 

k 

k 

k 

Clenshaw-Curtis 

0.66 

-0.49 

1.41 

-0.50 

-0.20 

1.40 

-0.20 

1.47 

Leja 

0.17 

-1.24 

0.75 

-1.24 

-0.86 

0.74 

-0.86 

0.79 


Table 5: Estimated ci; and /3 parameters for / 2 (y). 


It is interesting to note that several of the j3k estimates in Table [5] are in fact positive, as 
opposed to the negative values suggested in the theoretical derivation of (O and Assumption [2j 
This indicates that the coefficients Cj exhibit a combination of exponential and algebraic decay 
and the added flexibility of (3 (i.e., imposing no restictions on the individual results in a irrore 
accurate estimate of the optimal polynomial space. Thus, our approach is applicable to a wider 
class of functions, namely those obeying estimate GD for any /3, albeit rigorous characterization of 
this class of functions is not currently available. 

Anisotropic case 

The second variation of the inclusion problem uses only the four corner disks 

^top-left ^ [0.109,1], ytop-right ^ [0.2575,1], 

2^bottom-left ^ [0.010,1], 2^bottom-right ^ [0.4060,1]. 

The diffusion coefficient is constant 1 outside the four boxes. The four parameter forcing term and 
quantity of interest are now defined over the entire domain, i.e., 63 ( 0 ;) = 100 and 

/ 3 (y) = [ u{x,y)dx. (36) 

J D 

Despite the smaller number of dimensions, the four parameter problem is more difficult, due to the 
smaller amin and smaller region of analyticity. For this problem, anisotropic total degree weights 
have been derived 

a = (40,317,137,227), (37) 
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0 2000 4000 6000 8000 10000 12000 14000 16000 18000 
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Figure 6 : Applying three different construction of interpolants fsiy). Left: using Clenshaw-Curtis 
nodes. Right: using Leja nodes. 


Dimension 

ai 

02 

ds 

0:4 

01 

02 

03 

04 

Clenshaw-Curtis 

0.18 

2.38 

0.18 

1.27 

12.70 

10.64 

13.62 

12.00 

Leja 

0.24 

2.22 

0.24 

1.30 

6.95 

5.69 

8.75 

6.94 


Table 6 : Estimated a and (3 parameters for fz{y)- 


and numerical results show that (1371) are reasonably accurate when used in the context of Galerkin 
projection [T]. Figure [5] shows a comparison in the convergence rate between interpolants con¬ 
structed via the total degree weights given in (1371) and the application of Algorithm [1] Table [5] lists 
the estimated a and 0 parameters. The convergence rate of the interpolants is closer to algebraic, 
leading to a large /3, which dominates for indexes with small entries. Thus, in the context of in¬ 
terpolation, the total degree space estimated by the weights (1571) is far from optimal. Initially, the 
least squares method struggles to estimate the quasi-optimal parameters, however, once a sufficient 
number of samples are computed, the convergence rate of Algorithm [T] dramatically increases lead¬ 
ing to enormous savings compared to the total degree selection. Furthermore, due to the small 
region of analyticity of fz{y), the difference in Lebesgue constant between Clenshaw-Curtis and 
Leja has a significant effect with the former method outperforming the latter. 


4.2 Parametrized steady state Burgers equation 


Ub{x) = 


Consider the steady state Burgers equation defined over the domain D = [0,1]® [0,0.5]\[0.15,0.25]® 
[0.15,0.35] (see Figure [71). The right most wall of the domain is associated with homogenous 
Neumann boundary conditions, i.e., dDn = {1}® [0,0.5], the remainder of the boundary, i.e., dDd-, 
is associated with the Dirichlet conditions defined by 

16x2(0.5 - X2), xi = 0 , . 

0, otherwise. ^ 

A time dependent control problem using the above domain is described in m- In our context, we 
are interested in parametrizing the steady state Burgers equation given by 

{a{y)\7^u{x,y)) + {v{y) ■\/^u{x,y))u{x,y) =0, x e D, 

u{x,y) =Ub{x), X G dDd, (39) 

^u{x,y) = 0 , xGdDn, 

where y G {^fc=i[—1,1] and 


i(y) = 


1 


200 -b IQOys 


v{y) = 


1 + 0 . 22/1 
O.I 2/2 


^ 0.45 rO-S 


/4(y) = 


u{x, y)dxidx 2 


/0.05 J 0.6 
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Figure 7: The domain associated with the Burgers equation (I39p . the color map corresponds to the 
nominal solution y = 0. 




Figure 8 ; Applying three different construction of interpolants for / 4 (y). Left: using Clenshaw-Curtis 
nodes. Right: using Leja nodes. 


Figure [7] plots the nominal solution corresponding to y = 0. There is no available result regarding 
the analyticity of / 4 (y) or the corresponding optimal polynomial space. However, the components 
of the convection term v{y) affect different derivatives of the non-uniform solution, yi and ?/2 are 
multiplied by different coefficients, and t /3 affects the diffusion term which has very different effect 
on u(a:, y). Thus, we expect anisotropic decay behavior of f^iy). 

Figure [ 8 ] plots the result of interpolating fi{y) with three different polynomial selection schemes, 
the isotropic total degree (see Remark |4]), and Algorithm [T] ignoring and including /3. We use 
Leja and Clenshaw-Curtis nodes and the estimated parameters are given in Table [T] We observe 
results very similar to the previous examples, which shows that our approach extends to problems 
associated with nonlinear PDFs. 


Dimension 

di 

02 

ds 

$1 

$2 

$3 

Clenshaw-Curtis 

1.22 

1.53 

0.25 

-1.57 

-1.71 

0.25 

Leja 

1.18 

1.46 

0.35 

-2.37 

-2.40 

-0.76 


Table 7: Estimated ci; and (3 parameters for 
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5 Conclusion 


In this work, we have presented an adaptive algorithm for constructing a sequence of interpolants of 
a function defined over a product of one dimensional intervals and admitting analytic extension to 
poly-ellipse in complex space. Following recent results in “best M-term” approximation, we derive 
a heuristic estimate for the optimal interpolation space, which we parametrize by two vectors, one 
depending on the region of analytic extension of the function and one depending on the Lebesgue 
constant of the interpolation scheme. Traditional methods for (quasi-)optimal approximation rely 
on a priori estimates, instead, we present a procedure for constructing a sequence of polynomial 
spaces, where each space is derived from an estimate inferred from an interpolant in the previous 
space. Each interpolant is constructed using a sparse grids approach, and we present a strategy 
for selecting the tensor rules so that the resulting grid is optimal (i.e., fewest nodes) with respect 
to the polynomial space. We also present several novel interpolation rules derived from greedy 
optimization of operator norms. Our numerical experiments demonstrate that the method can 
be applied to a wide range of problems without the need for a priori estimates of the region of 
analyticity of the function, so long as the function is analytic and the one dimensional family of 
rules inducing the sparse grid exhibits polynomial growth of the Lebesgue constant. 
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